maf 格式 (Mutation Annotation Format)?MAF(Mutation
Annotation Format)是一種廣泛使用的突變註解檔格式,常見於 TCGA、ICGC
與各大癌症基因體研究專案。
每一列代表一個 somatic variant,並包含以下主要欄位:
mafVCF 可以使用 vcf2maf 將 vcf 轉 mafmaftools 中的
annovarToMaf 函數,將表格形式的 ANNOVAR 輸出轉換為
MAF。maftools?maftools 是一個用於分析並視覺化 MAF 檔案的套件。
以下指令是根據官方文檔範例撰寫,並新增個人的筆記內容,所有相關著作權屬於原作者(如下):
Mayakonda A, Lin DC, Assenov Y, Plass C, Koeffler HP. 2018. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Research. PMID: 30341162
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
if (!requireNamespace("maftools", quietly = TRUE)) {
BiocManager::install("maftools", update = FALSE, ask = FALSE)
}
# create `results/` folder for output data
if (!dir.exists("results")) dir.create("results", recursive = TRUE)
使用 read.maf 函數
library(maftools)
#path to TCGA LAML MAF file
laml.maf = system.file('extdata', 'tcga_laml.maf.gz', package = 'maftools')
#clinical information containing survival information and histology. This is optional
laml.clin = system.file('extdata', 'tcga_laml_annot.tsv', package = 'maftools')
laml = read.maf(maf = laml.maf, clinicalData = laml.clin)## -Reading
## -Validating
## -Silent variants: 475
## -Summarizing
## -Processing clinical data
## -Finished in 0.159s elapsed (0.395s cpu)
## An object of class MAF
## ID summary Mean Median
## <char> <char> <num> <num>
## 1: NCBI_Build 37 NA NA
## 2: Center genome.wustl.edu NA NA
## 3: Samples 193 NA NA
## 4: nGenes 1241 NA NA
## 5: Frame_Shift_Del 52 0.269 0
## 6: Frame_Shift_Ins 91 0.472 0
## 7: In_Frame_Del 10 0.052 0
## 8: In_Frame_Ins 42 0.218 0
## 9: Missense_Mutation 1342 6.953 7
## 10: Nonsense_Mutation 103 0.534 0
## 11: Splice_Site 92 0.477 0
## 12: total 1732 8.974 9
## Tumor_Sample_Barcode Frame_Shift_Del Frame_Shift_Ins In_Frame_Del
## <fctr> <int> <int> <int>
## 1: TCGA-AB-3009 0 5 0
## 2: TCGA-AB-2807 1 0 1
## 3: TCGA-AB-2959 0 0 0
## 4: TCGA-AB-3002 0 0 0
## 5: TCGA-AB-2849 0 1 0
## ---
## 189: TCGA-AB-2942 0 0 0
## 190: TCGA-AB-2946 0 0 0
## 191: TCGA-AB-2954 0 0 0
## 192: TCGA-AB-2982 0 0 0
## 193: TCGA-AB-2903 0 0 0
## In_Frame_Ins Missense_Mutation Nonsense_Mutation Splice_Site total
## <int> <int> <int> <int> <num>
## 1: 1 25 2 1 34
## 2: 0 16 3 4 25
## 3: 0 22 0 1 23
## 4: 0 15 1 5 21
## 5: 0 16 1 2 20
## ---
## 189: 1 0 0 0 1
## 190: 0 1 0 0 1
## 191: 0 1 0 0 1
## 192: 0 1 0 0 1
## 193: 0 0 0 0 0
## Hugo_Symbol Frame_Shift_Del Frame_Shift_Ins In_Frame_Del In_Frame_Ins
## <char> <int> <int> <int> <int>
## 1: FLT3 0 0 1 33
## 2: DNMT3A 4 0 0 0
## 3: NPM1 0 33 0 0
## 4: IDH2 0 0 0 0
## 5: IDH1 0 0 0 0
## ---
## 1237: ZNF689 0 0 0 0
## 1238: ZNF75D 0 0 0 0
## 1239: ZNF827 1 0 0 0
## 1240: ZNF99 0 0 0 0
## 1241: ZPBP 0 0 0 0
## Missense_Mutation Nonsense_Mutation Splice_Site total MutatedSamples
## <int> <int> <int> <num> <int>
## 1: 15 0 3 52 52
## 2: 39 5 6 54 48
## 3: 1 0 0 34 33
## 4: 20 0 0 20 20
## 5: 18 0 0 18 18
## ---
## 1237: 1 0 0 1 1
## 1238: 1 0 0 1 1
## 1239: 0 0 0 1 1
## 1240: 1 0 0 1 1
## 1241: 1 0 0 1 1
## AlteredSamples
## <int>
## 1: 52
## 2: 48
## 3: 33
## 4: 20
## 5: 18
## ---
## 1237: 1
## 1238: 1
## 1239: 1
## 1240: 1
## 1241: 1
## Tumor_Sample_Barcode FAB_classification days_to_last_followup
## <char> <char> <num>
## 1: TCGA-AB-2802 M4 365
## 2: TCGA-AB-2803 M3 792
## 3: TCGA-AB-2804 M3 2557
## 4: TCGA-AB-2805 M0 577
## 5: TCGA-AB-2806 M1 945
## ---
## 189: TCGA-AB-3007 M3 1581
## 190: TCGA-AB-3008 M1 822
## 191: TCGA-AB-3009 M4 577
## 192: TCGA-AB-3011 M1 1885
## 193: TCGA-AB-3012 M3 1887
## Overall_Survival_Status
## <int>
## 1: 1
## 2: 1
## 3: 0
## 4: 1
## 5: 1
## ---
## 189: 0
## 190: 1
## 191: 1
## 192: 0
## 193: 0
## [1] "Hugo_Symbol" "Entrez_Gene_Id" "Center"
## [4] "NCBI_Build" "Chromosome" "Start_Position"
## [7] "End_Position" "Strand" "Variant_Classification"
## [10] "Variant_Type" "Reference_Allele" "Tumor_Seq_Allele1"
## [13] "Tumor_Seq_Allele2" "Tumor_Sample_Barcode" "Protein_Change"
## [16] "i_TumorVAF_WU" "i_transcript_name"
plotmafSummary
可以繪製整體資料的摘要,用於快速釐清資料
oncoplot 繪製樣本中的變異基因與變異種類 (Multi_Hit
是指在同一樣本中發生多次突變的基因)
在突變類型中,單一鹼基替換(Single Nucleotide Variant, SNV)可依照其替換的化學性質分為兩大類:
1. Transition (Ti) Purine ↔︎ Purine 或 Pyrimidine ↔︎ Pyrimidine 的替換,包括: - A ↔︎ G(兩者皆為 purine) - C ↔︎ T(兩者皆為 pyrimidine)
Transition 牽涉的化學變化較小,因此通常比 transversion 更常見。
2. Transversion (Tv) Purine ↔︎ Pyrimidine 的替換,包括: - A ↔︎ C - A ↔︎ T - G ↔︎ C - G ↔︎ T
由於涉及較大的化學結構變化,發生頻率通常較低。
maftools 提供 titv() 函式來計算 Transition
與 Transversion 的數量與比例,
並可利用 plotTiTv() 將結果視覺化。
laml.titv = titv(maf = laml, plot = FALSE, useSyn = TRUE)
#plot titv summary
plotTiTv(res = laml.titv)#lollipop plot for DNMT3A, which is one of the most frequent mutated gene in Leukemia.
lollipopPlot(
maf = laml,
gene = 'DNMT3A',
AACol = 'Protein_Change',
showMutationRate = TRUE,
labelPos = 882
)## 3 transcripts available. Use arguments refSeqID or proteinID to manually specify tx name.
## HGNC refseq.ID protein.ID aa.length
## <char> <char> <char> <num>
## 1: DNMT3A NM_022552 NP_072046 912
## 2: DNMT3A NM_153759 NP_715640 723
## 3: DNMT3A NM_175629 NP_783328 912
## Using longer transcript NM_022552 for now.
## Removed 3 mutations for which AA position was not available
資料應包含變異位置(pos)與變異數量(count)兩個欄位
#example data
my_data = data.frame(pos = sample.int(912, 15, replace = TRUE), count = sample.int(30, 15, replace = TRUE))
head(my_data)## pos count
## 1 131 16
## 2 655 6
## 3 150 26
## 4 586 10
## 5 163 17
## 6 153 2
## 3 transcripts available. Use arguments refSeqID or proteinID to manually specify tx name.
## HGNC refseq.ID protein.ID aa.length
## <char> <char> <char> <num>
## 1: DNMT3A NM_022552 NP_072046 912
## 2: DNMT3A NM_153759 NP_715640 723
## 3: DNMT3A NM_175629 NP_783328 912
## Using longer transcript NM_022552 for now.
rainfallPlot 繪製 Rainfall plots 觀察 mutation hot spot
- Kataegis
定義為包含六個或更多連續突變的基因組片段,平均突變間距離小於或等於1,00
bp
brca <- system.file("extdata", "brca.maf.gz", package = "maftools")
brca = read.maf(maf = brca, verbose = FALSE)
rainfallPlot(maf = brca, detectChangePoints = TRUE, pointSize = 0.4)## Processing TCGA-A8-A08B..
## Kataegis detected at:
## Chromosome Start_Position End_Position nMuts Avg_intermutation_dist Size
## <num> <num> <num> <int> <num> <num>
## 1: 8 98129348 98133560 7 702.0000 4212
## 2: 8 98398549 98403536 9 623.3750 4987
## 3: 8 98453076 98456466 9 423.7500 3390
## 4: 8 124090377 124096810 22 306.3333 6433
## 5: 12 97436055 97439705 7 608.3333 3650
## 6: 17 29332072 29336153 8 583.0000 4081
## Tumor_Sample_Barcode C>G C>T
## <fctr> <int> <int>
## 1: TCGA-A8-A08B 4 3
## 2: TCGA-A8-A08B 1 8
## 3: TCGA-A8-A08B 1 8
## 4: TCGA-A8-A08B 1 21
## 5: TCGA-A8-A08B 4 3
## 6: TCGA-A8-A08B 4 4
## [1] TRUE
laml.mutload = tcgaCompare(maf = laml, cohortName = 'Example-LAML', logscale = TRUE, capture_size = 50)## Warning in FUN(X[[i]], ...): Removed 1 samples with zero mutations.
## Capture size [TCGA]: 35.8
## Capture size [Input]: 50
## Performing pairwise t-test for differences in mutation burden (per MB)..
gistic_res_folder <- system.file("extdata", package = "maftools")
laml.gistic = readGistic(gisticDir = gistic_res_folder, isTCGA = TRUE)## -Processing Gistic files..
## --Processing amp_genes.conf_99.txt
## --Processing del_genes.conf_99.txt
## --Processing scores.gistic
## --Summarizing by samples
## An object of class GISTIC
## ID summary
## <char> <num>
## 1: Samples 191
## 2: nGenes 2622
## 3: cytoBands 16
## 4: Amp 388
## 5: Del 26481
## 6: total 26869
laml.seg <- system.file("extdata", "LAML_CBS_segments.tsv.gz", package = "maftools")
segSummarize_results = segSummarize(seg = laml.seg)## Recurrent chromosomal arm aberrations
## arm Variant_Classification N
## <char> <char> <int>
## 1: 5_q Loss 9
## 2: 16_q Loss 2
## 3: 21_q Gain 2
## 4: 7_q Loss 2
## 5: 1_p Gain 1
## 6: 1_q Gain 1
## 7: 11_p Loss 1
## 8: 11_q Gain 1
## 9: 12_p Loss 1
## 10: 18_p Loss 1
## 11: 19_q Gain 1
## 12: 19_p Gain 1
## 13: 20_q Loss 1
## 14: 3_p Loss 1
## 15: 7_p Loss 1
## 16: 8_q Gain 1
## 17: 9_q Loss 1
#exclusive/co-occurance event analysis on top 10 mutated genes.
somaticInteractions(maf = laml, top = 25, pvalue = c(0.05, 0.1))## gene1 gene2 pValue oddsRatio 00 01 11 10 pAdj
## <char> <char> <num> <num> <int> <int> <int> <int> <num>
## 1: ASXL1 RUNX1 0.0001541586 55.215541 176 12 4 1 0.003568486
## 2: IDH2 RUNX1 0.0002809928 9.590877 164 9 7 13 0.006055880
## 3: IDH2 ASXL1 0.0004030636 41.077327 172 1 4 16 0.008126283
## 4: FLT3 NPM1 0.0009929836 3.763161 125 16 17 35 0.018664260
## 5: SMC3 DNMT3A 0.0010451985 20.177713 144 42 6 1 0.018664260
## ---
## 296: PLCE1 ASXL1 1.0000000000 0.000000 184 5 0 4 1.000000000
## 297: RAD21 FAM5C 1.0000000000 0.000000 183 5 0 5 1.000000000
## 298: PLCE1 FAM5C 1.0000000000 0.000000 184 5 0 4 1.000000000
## 299: PLCE1 RAD21 1.0000000000 0.000000 184 5 0 4 1.000000000
## 300: EZH2 PLCE1 1.0000000000 0.000000 186 4 0 3 1.000000000
## Event pair event_ratio
## <char> <char> <char>
## 1: Co_Occurence ASXL1, RUNX1 4/13
## 2: Co_Occurence IDH2, RUNX1 7/22
## 3: Co_Occurence ASXL1, IDH2 4/17
## 4: Co_Occurence FLT3, NPM1 17/51
## 5: Co_Occurence DNMT3A, SMC3 6/43
## ---
## 296: Mutually_Exclusive ASXL1, PLCE1 0/9
## 297: Mutually_Exclusive FAM5C, RAD21 0/10
## 298: Mutually_Exclusive FAM5C, PLCE1 0/9
## 299: Mutually_Exclusive PLCE1, RAD21 0/9
## 300: Mutually_Exclusive EZH2, PLCE1 0/7
## Warning in oncodrive(maf = laml, AACol = "Protein_Change", minMut = 5,
## pvalMethod = "zscore"): Oncodrive has been superseeded by OncodriveCLUSTL. See
## http://bg.upf.edu/group/projects/oncodrive-clust.php
## Estimating background scores from synonymous variants..
## Not enough genes to build background. Using predefined values. (Mean = 0.279; SD = 0.13)
## Estimating cluster scores from non-syn variants..
## | | | 0% | |=== | 4% | |====== | 9% | |========= | 13% | |============ | 17% | |=============== | 22% | |================== | 26% | |===================== | 30% | |======================== | 35% | |=========================== | 39% | |============================== | 43% | |================================= | 48% | |===================================== | 52% | |======================================== | 57% | |=========================================== | 61% | |============================================== | 65% | |================================================= | 70% | |==================================================== | 74% | |======================================================= | 78% | |========================================================== | 83% | |============================================================= | 87% | |================================================================ | 91% | |=================================================================== | 96% | |======================================================================| 100%
## Comapring with background model and estimating p-values..
## Done !
## Hugo_Symbol Frame_Shift_Del Frame_Shift_Ins In_Frame_Del In_Frame_Ins
## <char> <int> <int> <int> <int>
## 1: IDH1 0 0 0 0
## 2: IDH2 0 0 0 0
## 3: NPM1 0 33 0 0
## 4: NRAS 0 0 0 0
## 5: U2AF1 0 0 0 0
## 6: KIT 1 1 0 1
## Missense_Mutation Nonsense_Mutation Splice_Site total MutatedSamples
## <int> <int> <int> <num> <int>
## 1: 18 0 0 18 18
## 2: 20 0 0 20 20
## 3: 1 0 0 34 33
## 4: 15 0 0 15 15
## 5: 8 0 0 8 8
## 6: 7 0 0 10 8
## AlteredSamples clusters muts_in_clusters clusterScores protLen zscore
## <int> <int> <int> <num> <int> <num>
## 1: 18 1 18 1.0000000 414 5.546154
## 2: 20 2 20 1.0000000 452 5.546154
## 3: 33 2 32 0.9411765 294 5.093665
## 4: 15 2 15 0.9218951 189 4.945347
## 5: 8 1 7 0.8750000 240 4.584615
## 6: 8 2 9 0.8500000 976 4.392308
## pval fdr fract_muts_in_clusters
## <num> <num> <num>
## 1: 1.460110e-08 1.022077e-07 1.0000000
## 2: 1.460110e-08 1.022077e-07 1.0000000
## 3: 1.756034e-07 8.194826e-07 0.9411765
## 4: 3.800413e-07 1.330144e-06 1.0000000
## 5: 2.274114e-06 6.367520e-06 0.8750000
## 6: 5.607691e-06 1.308461e-05 0.9000000
## Warning in pfamDomains(maf = laml, AACol = "Protein_Change", top = 10): Removed
## 50 mutations for which AA position was not available
#Protein summary (Printing first 7 columns for display convenience)
laml.pfam$proteinSummary[,1:7, with = FALSE]## HGNC AAPos Variant_Classification N total fraction DomainLabel
## <char> <num> <fctr> <int> <num> <num> <char>
## 1: DNMT3A 882 Missense_Mutation 27 54 0.5000000 AdoMet_MTases
## 2: IDH1 132 Missense_Mutation 18 18 1.0000000 PTZ00435
## 3: IDH2 140 Missense_Mutation 17 20 0.8500000 PTZ00435
## 4: FLT3 835 Missense_Mutation 14 52 0.2692308 PKc_like
## 5: FLT3 599 In_Frame_Ins 10 52 0.1923077 PKc_like
## ---
## 1512: ZNF646 875 Missense_Mutation 1 1 1.0000000 <NA>
## 1513: ZNF687 554 Missense_Mutation 1 2 0.5000000 <NA>
## 1514: ZNF687 363 Missense_Mutation 1 2 0.5000000 <NA>
## 1515: ZNF75D 5 Missense_Mutation 1 1 1.0000000 <NA>
## 1516: ZNF827 427 Frame_Shift_Del 1 1 1.0000000 <NA>
#Domain summary (Printing first 3 columns for display convenience)
laml.pfam$domainSummary[,1:3, with = FALSE]## DomainLabel nMuts nGenes
## <char> <int> <int>
## 1: PKc_like 55 5
## 2: PTZ00435 38 2
## 3: AdoMet_MTases 33 1
## 4: 7tm_1 24 24
## 5: COG5048 17 17
## ---
## 499: ribokinase 1 1
## 500: rim_protein 1 1
## 501: sigpep_I_bact 1 1
## 502: trp 1 1
## 503: zf-BED 1 1
#Survival analysis based on grouping of DNMT3A mutation status
mafSurvival(maf = laml, genes = 'DNMT3A', time = 'days_to_last_followup', Status = 'Overall_Survival_Status', isTCGA = TRUE)## Looking for clinical data in annoatation slot of MAF..
## Number of mutated samples for given genes:
## DNMT3A
## 48
## Removed 11 samples with NA's
## Median survival..
## Group medianTime N
## <char> <num> <int>
## 1: Mutant 245 45
## 2: WT 396 137
#Using top 20 mutated genes to identify a set of genes (of size 2) to predict poor prognostic groups
prog_geneset = survGroup(maf = laml, top = 20, geneSetSize = 2, time = "days_to_last_followup", Status = "Overall_Survival_Status", verbose = FALSE)## Removed 11 samples with NA's
## Gene_combination P_value hr WT Mutant
## <char> <num> <num> <int> <int>
## 1: FLT3_DNMT3A 0.00104 2.510 164 18
## 2: DNMT3A_SMC3 0.04880 2.220 176 6
## 3: DNMT3A_NPM1 0.07190 1.720 166 16
## 4: DNMT3A_TET2 0.19600 1.780 176 6
## 5: FLT3_TET2 0.20700 1.860 177 5
## 6: NPM1_IDH1 0.21900 0.495 176 6
## 7: DNMT3A_IDH1 0.29300 1.500 173 9
## 8: IDH2_RUNX1 0.31800 1.580 176 6
## 9: FLT3_NPM1 0.53600 1.210 165 17
## 10: DNMT3A_IDH2 0.68000 0.747 178 4
## 11: DNMT3A_NRAS 0.99200 0.986 178 4
mafSurvGroup(maf = laml, geneSet = c("DNMT3A", "FLT3"), time = "days_to_last_followup", Status = "Overall_Survival_Status")## Looking for clinical data in annoatation slot of MAF..
## Removed 11 samples with NA's
## Median survival..
## Group medianTime N
## <char> <num> <int>
## 1: Mutant 242.5 18
## 2: WT 379.5 164
#Primary APL MAF
primary.apl = system.file("extdata", "APL_primary.maf.gz", package = "maftools")
primary.apl = read.maf(maf = primary.apl)## -Reading
## -Validating
## --Non MAF specific values in Variant_Classification column:
## ITD
## -Silent variants: 45
## -Summarizing
## -Processing clinical data
## --Missing clinical data
## -Finished in 0.063s elapsed (0.263s cpu)
#Relapse APL MAF
relapse.apl = system.file("extdata", "APL_relapse.maf.gz", package = "maftools")
relapse.apl = read.maf(maf = relapse.apl)## -Reading
## -Validating
## --Non MAF specific values in Variant_Classification column:
## ITD
## -Silent variants: 19
## -Summarizing
## -Processing clinical data
## --Missing clinical data
## -Finished in 0.046s elapsed (0.201s cpu)
#Considering only genes which are mutated in at-least in 5 samples in one of the cohort to avoid bias due to genes mutated in single sample.
pt.vs.rt <- mafCompare(m1 = primary.apl, m2 = relapse.apl, m1Name = 'Primary', m2Name = 'Relapse', minMut = 5)
print(pt.vs.rt)## $results
## Hugo_Symbol Primary Relapse pval or ci.up ci.low
## <char> <num> <int> <num> <num> <num> <num>
## 1: PML 1 11 1.529935e-05 0.03537381 0.2552937 0.000806034
## 2: RARA 0 7 2.574810e-04 0.00000000 0.3006159 0.000000000
## 3: RUNX1 1 5 1.310500e-02 0.08740567 0.8076265 0.001813280
## 4: FLT3 26 4 1.812779e-02 3.56086275 14.7701728 1.149009169
## 5: ARID1B 5 8 2.758396e-02 0.26480490 0.9698686 0.064804160
## 6: WT1 20 14 2.229087e-01 0.60619329 1.4223101 0.263440988
## 7: KRAS 6 1 4.334067e-01 2.88486293 135.5393108 0.337679367
## 8: NRAS 15 4 4.353567e-01 1.85209500 8.0373994 0.553883512
## 9: ARID1A 7 4 7.457274e-01 0.80869223 3.9297309 0.195710173
## adjPval
## <num>
## 1: 0.0001376942
## 2: 0.0011586643
## 3: 0.0393149868
## 4: 0.0407875250
## 5: 0.0496511201
## 6: 0.3343630535
## 7: 0.4897762916
## 8: 0.4897762916
## 9: 0.7457273717
##
## $SampleSummary
## Cohort SampleSize
## <char> <num>
## 1: Primary 124
## 2: Relapse 58
genes = c("PML", "RARA", "RUNX1", "ARID1B", "FLT3")
coOncoplot(m1 = primary.apl, m2 = relapse.apl, m1Name = 'PrimaryAPL', m2Name = 'RelapseAPL', genes = genes, removeNonMutated = TRUE)lollipopPlot2(m1 = primary.apl, m2 = relapse.apl, gene = "PML", AACol1 = "amino_acid_change", AACol2 = "amino_acid_change", m1_name = "Primary", m2_name = "Relapse")## Gene: PML
## 9 transcripts available. Use arguments refSeqID or proteinID to manually specify tx name.
## HGNC refseq.ID protein.ID aa.length
## <char> <char> <char> <num>
## 1: PML NM_002675 NP_002666 633
## 2: PML NM_033238 NP_150241 882
## 3: PML NM_033239 NP_150242 829
## 4: PML NM_033240 NP_150243 611
## 5: PML NM_033244 NP_150247 560
## 6: PML NM_033246 NP_150249 423
## 7: PML NM_033247 NP_150250 435
## 8: PML NM_033249 NP_150252 585
## 9: PML NM_033250 NP_150253 781
## Using longer transcript NM_033238 for now.
## 9 transcripts available. Use arguments refSeqID or proteinID to manually specify tx name.
## HGNC refseq.ID protein.ID aa.length
## <char> <char> <char> <num>
## 1: PML NM_002675 NP_002666 633
## 2: PML NM_033238 NP_150241 882
## 3: PML NM_033239 NP_150242 829
## 4: PML NM_033240 NP_150243 611
## 5: PML NM_033244 NP_150247 560
## 6: PML NM_033246 NP_150249 423
## 7: PML NM_033247 NP_150250 435
## 8: PML NM_033249 NP_150252 585
## 9: PML NM_033250 NP_150253 781
## Using longer transcript NM_033238 for now.
## Sample size per factor in FAB_classification:
##
## M0 M1 M2 M3 M4 M5 M6 M7
## 19 44 44 21 39 19 3 3
#Results are returned as a list. Significant associations p-value < 0.05
fab.ce$groupwise_comparision[p_value < 0.05]## Hugo_Symbol Group1 Group2 n_mutated_group1 n_mutated_group2 p_value
## <char> <char> <char> <char> <char> <num>
## 1: IDH1 M1 Rest 11 of 44 7 of 149 0.0002597371
## 2: TP53 M7 Rest 3 of 3 12 of 190 0.0003857187
## 3: DNMT3A M5 Rest 10 of 19 38 of 174 0.0089427384
## 4: CEBPA M2 Rest 7 of 44 6 of 149 0.0117352110
## 5: RUNX1 M0 Rest 5 of 19 11 of 174 0.0117436825
## 6: NPM1 M5 Rest 7 of 19 26 of 174 0.0248582372
## 7: NPM1 M3 Rest 0 of 21 33 of 172 0.0278630823
## 8: DNMT3A M3 Rest 1 of 21 47 of 172 0.0294005111
## OR OR_low OR_high fdr
## <num> <num> <num> <num>
## 1: 6.670592 2.173829026 21.9607250 0.0308575
## 2: Inf 5.355415451 Inf 0.0308575
## 3: 3.941207 1.333635173 11.8455979 0.3757978
## 4: 4.463237 1.204699322 17.1341278 0.3757978
## 5: 5.216902 1.243812880 19.4051505 0.3757978
## 6: 3.293201 1.001404899 10.1210509 0.5880102
## 7: 0.000000 0.000000000 0.8651972 0.5880102
## 8: 0.133827 0.003146708 0.8848897 0.5880102
## Number of claimed drugs for given genes:
## Gene N
## <char> <int>
## 1: DNMT3A 7
## Gene interaction_types drug_name drug_claim_name
## <char> <char> <char> <char>
## 1: DNMT3A N/A
## 2: DNMT3A DAUNORUBICIN Daunorubicin
## 3: DNMT3A DECITABINE Decitabine
## 4: DNMT3A IDARUBICIN IDARUBICIN
## 5: DNMT3A DECITABINE DECITABINE
## 6: DNMT3A inhibitor DECITABINE CHEMBL1201129
## 7: DNMT3A inhibitor AZACITIDINE CHEMBL1489
## Summarizing signalling pathways [Sanchez-Vega et al., https://doi.org/10.1016/j.cell.2018.03.035]
if (!requireNamespace("mclust", quietly = TRUE)) {
BiocManager::install("mclust", update = FALSE, ask = FALSE)
}
#Heterogeneity in sample TCGA.AB.2972
library("mclust")## Package 'mclust' version 6.1.2
## Type 'citation("mclust")' for citing this R package in publications.
## Processing TCGA-AB-2972..
## Tumor_Sample_Barcode cluster meanVaf
## <fctr> <char> <num>
## 1: TCGA-AB-2972 2 0.4496571
## 2: TCGA-AB-2972 1 0.2454750
## 3: TCGA-AB-2972 outlier 0.3695000
seg = system.file('extdata', 'TCGA.AB.3009.hg19.seg.txt', package = 'maftools')
tcga.ab.3009.het = inferHeterogeneity(maf = laml, tsb = 'TCGA-AB-3009', segFile = seg, vafCol = 'i_TumorVAF_WU')## Processing TCGA-AB-3009..
## Removed 1 variants with no copy number data.
## Hugo_Symbol Chromosome Start_Position End_Position Tumor_Sample_Barcode
## <char> <char> <num> <num> <fctr>
## 1: PHF6 23 133551224 133551224 TCGA-AB-3009
## t_vaf Segment_Start Segment_End Segment_Mean CN
## <num> <int> <int> <num> <num>
## 1: 0.9349112 NA NA NA NA
## Copy number altered variants:
## Hugo_Symbol Chromosome Start_Position End_Position Tumor_Sample_Barcode
## <char> <char> <num> <num> <fctr>
## 1: NFKBIL2 8 145668658 145668658 TCGA-AB-3009
## 2: NF1 17 29562981 29562981 TCGA-AB-3009
## 3: SUZ12 17 30293198 30293198 TCGA-AB-3009
## t_vaf Segment_Start Segment_End Segment_Mean CN cluster
## <num> <int> <int> <num> <num> <char>
## 1: 0.4415584 145232496 145760746 0.3976 2.634629 CN_altered
## 2: 0.8419000 29054355 30363868 -0.9157 1.060173 CN_altered
## 3: 0.8958333 29054355 30363868 -0.9157 1.060173 CN_altered
#Visualizing results. Highlighting those variants on copynumber altered variants.
plotClusters(clusters = tcga.ab.3009.het, genes = 'CN_altered', showCNvars = TRUE)if (!requireNamespace("BSgenome.Hsapiens.UCSC.hg19", quietly = TRUE)) {
BiocManager::install("BSgenome.Hsapiens.UCSC.hg19", update = FALSE, ask = FALSE)
}
library("BSgenome.Hsapiens.UCSC.hg19")## Loading required package: BSgenome
## Loading required package: BiocGenerics
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## anyDuplicated, aperm, append, as.data.frame, basename, cbind,
## colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
## get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
## match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
## Position, rank, rbind, Reduce, rownames, sapply, saveRDS, setdiff,
## table, tapply, union, unique, unsplit, which.max, which.min
## Loading required package: S4Vectors
## Loading required package: stats4
##
## Attaching package: 'S4Vectors'
## The following object is masked from 'package:utils':
##
## findMatches
## The following objects are masked from 'package:base':
##
## expand.grid, I, unname
## Loading required package: IRanges
## Loading required package: GenomeInfoDb
## Loading required package: GenomicRanges
## Loading required package: Biostrings
## Loading required package: XVector
##
## Attaching package: 'Biostrings'
## The following object is masked from 'package:base':
##
## strsplit
## Loading required package: BiocIO
## Loading required package: rtracklayer
##
## Attaching package: 'rtracklayer'
## The following object is masked from 'package:BiocIO':
##
## FileForFormat
laml.tnm = trinucleotideMatrix(maf = laml, prefix = 'chr', add = TRUE, ref_genome = "BSgenome.Hsapiens.UCSC.hg19")## Warning in trinucleotideMatrix(maf = laml, prefix = "chr", add = TRUE, ref_genome = "BSgenome.Hsapiens.UCSC.hg19"): Chromosome names in MAF must match chromosome names in reference genome.
## Ignorinig 101 single nucleotide variants from missing chromosomes chr23
## -Extracting 5' and 3' adjacent bases
## -Extracting +/- 20bp around mutated bases for background C>T estimation
## -Estimating APOBEC enrichment scores
## --Performing one-way Fisher's test for APOBEC enrichment
## ---APOBEC related mutations are enriched in 3.315 % of samples (APOBEC enrichment score > 2 ; 6 of 181 samples)
## -Creating mutation matrix
## --matrix of dimension 188x96
## -Processing clinical data
## -Processing clinical data
## $results
## Index: <Hugo_Symbol>
## Hugo_Symbol Enriched nonEnriched pval or ci.up ci.low
## <char> <num> <int> <num> <num> <num> <num>
## 1: TP53 2 13 0.08175632 5.997646 46.60886 0.49875432
## 2: TET2 1 16 0.45739351 1.940700 18.98398 0.03882963
## 3: FLT3 2 45 0.65523131 1.408185 10.21162 0.12341748
## 4: ADAM11 0 2 1.00000000 0.000000 164.19147 0.00000000
## 5: APOB 0 2 1.00000000 0.000000 164.19147 0.00000000
## ---
## 132: WAC 0 2 1.00000000 0.000000 164.19147 0.00000000
## 133: WT1 0 12 1.00000000 0.000000 12.69086 0.00000000
## 134: ZBTB33 0 2 1.00000000 0.000000 164.19147 0.00000000
## 135: ZC3H18 0 2 1.00000000 0.000000 164.19147 0.00000000
## 136: ZNF687 0 2 1.00000000 0.000000 164.19147 0.00000000
## adjPval
## <num>
## 1: 1
## 2: 1
## 3: 1
## 4: 1
## 5: 1
## ---
## 132: 1
## 133: 1
## 134: 1
## 135: 1
## 136: 1
##
## $SampleSummary
## Key: <Cohort>
## Cohort SampleSize Mean Median
## <char> <num> <num> <num>
## 1: Enriched 6 7.167 6.5
## 2: nonEnriched 172 9.715 9.0
packages <- c("NMF", "pheatmap")
for (pkg in packages) {
if (!requireNamespace(pkg, quietly = TRUE)) {
install.packages(pkg)
}
}
library('NMF')## Loading required package: pkgmaker
## Loading required package: registry
##
## Attaching package: 'pkgmaker'
## The following object is masked from 'package:S4Vectors':
##
## new2
## Loading required package: rngtools
## Loading required package: cluster
## NMF - BioConductor layer [OK] | Shared memory capabilities [NO: bigmemory] | Cores 27/28
## To enable shared memory capabilities, try: install.extras('
## NMF
## ')
##
## Attaching package: 'NMF'
## The following object is masked from 'package:S4Vectors':
##
## nrun
## -Running NMF for 6 ranks
## Compute NMF rank= 2 ... + measures ... OK
## Compute NMF rank= 3 ... + measures ... OK
## Compute NMF rank= 4 ... + measures ... OK
## Compute NMF rank= 5 ... + measures ... OK
## Compute NMF rank= 6 ... + measures ... OK
## -Finished in 26.6s elapsed (6.347s cpu)
## -Running NMF for factorization rank: 3
## -Finished in1.211s elapsed (1.144s cpu)
#Compate against original 30 signatures
laml.og30.cosm = compareSignatures(nmfRes = laml.sig, sig_db = "legacy")## -Comparing against COSMIC signatures
## ------------------------------------
## --Found Signature_1 most similar to COSMIC_1
## Aetiology: spontaneous deamination of 5-methylcytosine [cosine-similarity: 0.738]
## --Found Signature_2 most similar to COSMIC_19
## Aetiology: Unknown [cosine-similarity: 0.665]
## --Found Signature_3 most similar to COSMIC_1
## Aetiology: spontaneous deamination of 5-methylcytosine [cosine-similarity: 0.682]
## ------------------------------------
#Compate against updated version3 60 signatures
laml.v3.cosm = compareSignatures(nmfRes = laml.sig, sig_db = "SBS")## -Comparing against COSMIC signatures
## ------------------------------------
## --Found Signature_1 most similar to SBS15
## Aetiology: Defective DNA mismatch repair [cosine-similarity: 0.816]
## --Found Signature_2 most similar to SBS5
## Aetiology: Unknown [cosine-similarity: 0.64]
## --Found Signature_3 most similar to SBS6
## Aetiology: defective DNA mismatch repair [cosine-similarity: 0.619]
## ------------------------------------
library('pheatmap')
pheatmap::pheatmap(mat = laml.og30.cosm$cosine_similarities, cluster_rows = FALSE, main = "cosine similarity against validated signatures")var.annovar = system.file("extdata", "variants.hg19_multianno.txt", package = "maftools")
var.annovar.maf = annovarToMaf(annovar = var.annovar, Center = 'CSI-NUS', refBuild = 'hg19',
tsbCol = 'Tumor_Sample_Barcode', table = 'ensGene')## -Reading annovar files
## --Extracting tx, exon, txchange and aa-change
## -Adding Variant_Type
## -Converting Ensemble Gene IDs into HGNC gene symbols
## --Done. Original ensemble gene IDs are preserved under field name ens_id
## Finished in 0.083s elapsed (0.327s cpu)
#Read sample ICGC data for ESCA
esca.icgc <- system.file("extdata", "simple_somatic_mutation.open.ESCA-CN.sample.tsv.gz", package = "maftools")
esca.maf <- icgcSimpleMutationToMAF(icgc = esca.icgc, addHugoSymbol = TRUE)## Converting Ensemble Gene IDs into HGNC gene symbols.
## Done! Original ensemble gene IDs are preserved under field name ens_id
## --Removed 427 duplicated variants
## Hugo_Symbol Entrez_Gene_Id Center NCBI_Build Chromosome Start_Position
## <char> <int> <lgcl> <char> <char> <num>
## 1: AC005237.4 NA NA GRCh37 2 241987787
## 2: AC061992.1 786 NA GRCh37 17 76425382
## 3: AC097467.2 NA NA GRCh37 4 156294567
## 4: ADAMTS12 NA NA GRCh37 5 33684019
## 5: AL589642.1 54801 NA GRCh37 9 32630154
## End_Position Strand Variant_Classification Variant_Type Reference_Allele
## <num> <char> <char> <char> <char>
## 1: 241987787 + Intron SNP C
## 2: 76425382 + 3'Flank SNP C
## 3: 156294567 + Intron SNP A
## 4: 33684019 + Missense_Mutation SNP A
## 5: 32630154 + RNA SNP T
## Tumor_Seq_Allele1 Tumor_Seq_Allele2 dbSNP_RS dbSNP_Val_Status
## <char> <char> <lgcl> <lgcl>
## 1: C T NA NA
## 2: C T NA NA
## 3: A G NA NA
## 4: A C NA NA
## 5: T C NA NA
## Tumor_Sample_Barcode
## <fctr>
## 1: SA514619
## 2: SA514619
## 3: SA514619
## 4: SA514619
## 5: SA514619
## Converting gene names for 1 variants from 1 genes
## Key: <MutSig_Synonym>
## Hugo_Symbol MutSig_Synonym N
## <char> <char> <int>
## 1: ARHGAP35 GRLF1 1
## Original symbols are preserved under column OG_Hugo_Symbol.
#Extract data for samples 'TCGA.AB.3009' and 'TCGA.AB.2933' (Printing just 5 rows for display convenience)
subsetMaf(maf = laml, tsb = c('TCGA-AB-3009', 'TCGA-AB-2933'), mafObj = FALSE)[1:5]## Hugo_Symbol Entrez_Gene_Id Center NCBI_Build Chromosome
## <char> <int> <char> <int> <char>
## 1: ABCB11 8647 genome.wustl.edu 37 2
## 2: ACSS3 79611 genome.wustl.edu 37 12
## 3: ANK3 288 genome.wustl.edu 37 10
## 4: AP1G2 8906 genome.wustl.edu 37 14
## 5: ARC 23237 genome.wustl.edu 37 8
## Start_Position End_Position Strand Variant_Classification Variant_Type
## <num> <num> <char> <fctr> <fctr>
## 1: 169780250 169780250 + Missense_Mutation SNP
## 2: 81536902 81536902 + Missense_Mutation SNP
## 3: 61926581 61926581 + Splice_Site SNP
## 4: 24033309 24033309 + Missense_Mutation SNP
## 5: 143694874 143694874 + Missense_Mutation SNP
## Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 Tumor_Sample_Barcode
## <char> <char> <char> <fctr>
## 1: G G A TCGA-AB-3009
## 2: C C T TCGA-AB-3009
## 3: C C A TCGA-AB-3009
## 4: C C T TCGA-AB-3009
## 5: C C G TCGA-AB-3009
## Protein_Change i_TumorVAF_WU i_transcript_name
## <char> <num> <char>
## 1: p.A1283V 46.97218 NM_003742.2
## 2: p.A266V 47.32510 NM_024560.2
## 3: 43.99478 NM_020987.2
## 4: p.R346Q 47.08000 NM_003917.2
## 5: p.W253C 42.96435 NM_015193.3
##Same as above but return output as an MAF object (Default behaviour)
subsetMaf(maf = laml, tsb = c('TCGA-AB-3009', 'TCGA-AB-2933'))## -Processing clinical data
## An object of class MAF
## ID summary Mean Median
## <char> <char> <num> <num>
## 1: NCBI_Build 37 NA NA
## 2: Center genome.wustl.edu NA NA
## 3: Samples 2 NA NA
## 4: nGenes 34 NA NA
## 5: Frame_Shift_Ins 5 2.5 2.5
## 6: In_Frame_Ins 1 0.5 0.5
## 7: Missense_Mutation 26 13.0 13.0
## 8: Nonsense_Mutation 2 1.0 1.0
## 9: Splice_Site 1 0.5 0.5
## 10: total 35 17.5 17.5
#Select all Splice_Site mutations from DNMT3A and NPM1
subsetMaf(maf = laml, genes = c('DNMT3A', 'NPM1'), mafObj = FALSE,query = "Variant_Classification == 'Splice_Site'")## Hugo_Symbol Entrez_Gene_Id Center NCBI_Build Chromosome
## <char> <int> <char> <int> <char>
## 1: DNMT3A 1788 genome.wustl.edu 37 2
## 2: DNMT3A 1788 genome.wustl.edu 37 2
## 3: DNMT3A 1788 genome.wustl.edu 37 2
## 4: DNMT3A 1788 genome.wustl.edu 37 2
## 5: DNMT3A 1788 genome.wustl.edu 37 2
## 6: DNMT3A 1788 genome.wustl.edu 37 2
## Start_Position End_Position Strand Variant_Classification Variant_Type
## <num> <num> <char> <fctr> <fctr>
## 1: 25459874 25459874 + Splice_Site SNP
## 2: 25467208 25467208 + Splice_Site SNP
## 3: 25467022 25467022 + Splice_Site SNP
## 4: 25459803 25459803 + Splice_Site SNP
## 5: 25464576 25464576 + Splice_Site SNP
## 6: 25469029 25469029 + Splice_Site SNP
## Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 Tumor_Sample_Barcode
## <char> <char> <char> <fctr>
## 1: C C G TCGA-AB-2818
## 2: C C T TCGA-AB-2822
## 3: A A G TCGA-AB-2891
## 4: A A C TCGA-AB-2898
## 5: C C A TCGA-AB-2934
## 6: C C A TCGA-AB-2968
## Protein_Change i_TumorVAF_WU i_transcript_name
## <char> <num> <char>
## 1: p.R803S 36.84 NM_022552.3
## 2: 62.96 NM_022552.3
## 3: 34.78 NM_022552.3
## 4: 46.43 NM_022552.3
## 5: p.G646V 37.50 NM_022552.3
## 6: p.E477* 40.00 NM_022552.3
#Same as above but include only 'i_transcript_name' column in the output
subsetMaf(maf = laml, genes = c('DNMT3A', 'NPM1'), mafObj = FALSE, query = "Variant_Classification == 'Splice_Site'", fields = 'i_transcript_name')## Hugo_Symbol Chromosome Start_Position End_Position Reference_Allele
## <char> <char> <num> <num> <char>
## 1: DNMT3A 2 25459874 25459874 C
## 2: DNMT3A 2 25467208 25467208 C
## 3: DNMT3A 2 25467022 25467022 A
## 4: DNMT3A 2 25459803 25459803 A
## 5: DNMT3A 2 25464576 25464576 C
## 6: DNMT3A 2 25469029 25469029 C
## Tumor_Seq_Allele2 Variant_Classification Variant_Type Tumor_Sample_Barcode
## <char> <fctr> <fctr> <fctr>
## 1: G Splice_Site SNP TCGA-AB-2818
## 2: T Splice_Site SNP TCGA-AB-2822
## 3: G Splice_Site SNP TCGA-AB-2891
## 4: C Splice_Site SNP TCGA-AB-2898
## 5: A Splice_Site SNP TCGA-AB-2934
## 6: A Splice_Site SNP TCGA-AB-2968
## i_transcript_name
## <char>
## 1: NM_022552.3
## 2: NM_022552.3
## 3: NM_022552.3
## 4: NM_022552.3
## 5: NM_022552.3
## 6: NM_022552.3
#Path to BAM files
bams = c(
"DBW-40-N.bam",
"DBW-40-1T.bam",
"DBW-40-2T.bam",
"DBW-40-3T.bam",
"DBW-43-N.bam",
"DBW-43-1T.bam"
)
res = maftools::sampleSwaps(bams = bams, build = "hg19")
# Fetching readcounts from BAM files..
# Summarizing allele frequncy table..
# Performing pairwise comparison..
# Done!
res$pairwise_comparisonThe returned results is a list containing:
matrix of allele frequency table for every genotyped
SNPdata.frame of readcounts for ref and alt allele for
every genotyped SNP# X_bam Y_bam concordant_snps discordant_snps fract_concordant_snps cor_coef XY_possibly_paired
# 1: DBW-40-1T DBW-40-2T 5488 571 0.9057600 0.9656484 Yes
# 2: DBW-40-1T DBW-40-3T 5793 266 0.9560984 0.9758083 Yes
# 3: DBW-40-1T DBW-43-N 5534 525 0.9133520 0.9667620 Yes
# 4: DBW-40-2T DBW-40-3T 5853 206 0.9660010 0.9817475 Yes
# 5: DBW-40-2T DBW-43-N 5131 928 0.8468394 0.9297096 Yes
# 6: DBW-40-3T DBW-43-N 5334 725 0.8803433 0.9550670 Yes
# 7: DBW-40-N DBW-43-1T 5709 350 0.9422347 0.9725684 Yes
# 8: DBW-40-1T DBW-40-N 2829 3230 0.4669087 0.3808831 No
# 9: DBW-40-1T DBW-43-1T 2796 3263 0.4614623 0.3755364 No
# 10: DBW-40-2T DBW-40-N 2760 3299 0.4555207 0.3641647 No
# 11: DBW-40-2T DBW-43-1T 2736 3323 0.4515597 0.3579747 No
# 12: DBW-40-3T DBW-40-N 2775 3284 0.4579964 0.3770581 No
# 13: DBW-40-3T DBW-43-1T 2753 3306 0.4543654 0.3721022 No
# 14: DBW-40-N DBW-43-N 2965 3094 0.4893547 0.3839140 No
# 15: DBW-43-1T DBW-43-N 2876 3183 0.4746658 0.3797829 No# [[1]]
# [1] "DBW-40-1T" "DBW-40-2T" "DBW-40-3T" "DBW-43-N"
#
# [[2]]
# [1] "DBW-40-2T" "DBW-40-3T" "DBW-43-N"
#
# [[3]]
# [1] "DBW-40-3T" "DBW-43-N"
#
# [[4]]
# [1] "DBW-40-N" "DBW-43-1T"Results can be visualized with the correlation plot.
## Study_Abbreviation Study_Name MC3
## <char> <char> <int>
## 1: ACC Adrenocortical_carcinoma 92
## 2: BLCA Bladder_Urothelial_Carcinoma 411
## 3: BRCA Breast_invasive_carcinoma 1026
## Firehose CCLE
## <char> <char>
## 1: 62 [dx.doi.org/10.7908/C1610ZNC]
## 2: 395 [dx.doi.org/10.7908/C1MW2GGF]
## 3: 978 [dx.doi.org/10.7908/C1TB167Z]
## Loading LAML. Please cite: https://doi.org/10.1016/j.cels.2018.03.002 for reference
## An object of class MAF
## ID summary Mean Median
## <char> <char> <num> <num>
## 1: NCBI_Build GRCh37 NA NA
## 2: Center MC3_LAML NA NA
## 3: Samples 140 NA NA
## 4: nGenes 4142 NA NA
## 5: Frame_Shift_Del 131 0.936 0.0
## 6: Frame_Shift_Ins 377 2.693 0.0
## 7: In_Frame_Del 9 0.064 0.0
## 8: In_Frame_Ins 3 0.021 0.0
## 9: Missense_Mutation 4137 29.550 7.5
## 10: Nonsense_Mutation 264 1.886 0.0
## 11: Nonstop_Mutation 18 0.129 0.0
## 12: Splice_Site 780 5.571 1.0
## 13: Translation_Start_Site 4 0.029 0.0
## 14: total 5723 40.879 13.0
## Loading LAML. Please cite: dx.doi.org/10.7908/C1D21X2X for reference
## An object of class MAF
## ID summary Mean Median
## <char> <char> <num> <num>
## 1: NCBI_Build 37 NA NA
## 2: Center genome.wustl.edu NA NA
## 3: Samples 192 NA NA
## 4: nGenes 1241 NA NA
## 5: Frame_Shift_Del 52 0.271 0
## 6: Frame_Shift_Ins 91 0.474 0
## 7: In_Frame_Del 10 0.052 0
## 8: In_Frame_Ins 42 0.219 0
## 9: Missense_Mutation 1342 6.990 7
## 10: Nonsense_Mutation 103 0.536 0
## 11: Splice_Site 92 0.479 0
## 12: total 1732 9.021 9
## Loading required namespace: RaggedExperiment
## Loading required namespace: MultiAssayExperiment
## A MultiAssayExperiment object of 2 listed
## experiments with user-defined names and respective classes.
## Containing an ExperimentList class object of length 2:
## [1] maf_nonSyn: RaggedExperiment with 1732 rows and 193 columns
## [2] maf_syn: RaggedExperiment with 475 rows and 193 columns
## Functionality:
## experiments() - obtain the ExperimentList instance
## colData() - the primary/phenotype DataFrame
## sampleMap() - the sample coordination DataFrame
## `$`, `[`, `[[` - extract colData columns, subset, or experiment
## *Format() - convert into a long or wide DataFrame
## assays() - convert ExperimentList to a SimpleList of matrices
## exportClass() - save data to flat files
## R version 4.4.3 (2025-02-28)
## Platform: x86_64-conda-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
##
## Matrix products: default
## BLAS/LAPACK: /home/benson/miniforge3/envs/ngs-r/lib/libopenblasp-r0.3.30.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
## [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
## [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Asia/Taipei
## tzcode source: system (glibc)
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] pheatmap_1.0.13 doParallel_1.0.17
## [3] iterators_1.0.14 foreach_1.5.2
## [5] NMF_0.21.0 cluster_2.1.8.1
## [7] rngtools_1.5.2 pkgmaker_0.32.10
## [9] registry_0.5-1 Biobase_2.66.0
## [11] BSgenome.Hsapiens.UCSC.hg19_1.4.3 BSgenome_1.74.0
## [13] rtracklayer_1.66.0 BiocIO_1.16.0
## [15] Biostrings_2.74.0 XVector_0.46.0
## [17] GenomicRanges_1.58.0 GenomeInfoDb_1.42.0
## [19] IRanges_2.40.0 S4Vectors_0.44.0
## [21] BiocGenerics_0.52.0 mclust_6.1.2
## [23] maftools_2.22.0 DT_0.34.0
##
## loaded via a namespace (and not attached):
## [1] bitops_1.0-9 rlang_1.1.6
## [3] magrittr_2.0.4 gridBase_0.4-7
## [5] otel_0.2.0 matrixStats_1.5.0
## [7] compiler_4.4.3 vctrs_0.6.5
## [9] reshape2_1.4.5 stringr_1.6.0
## [11] pkgconfig_2.0.3 crayon_1.5.3
## [13] fastmap_1.2.0 Rsamtools_2.22.0
## [15] rmarkdown_2.30 UCSC.utils_1.2.0
## [17] xfun_0.55 MultiAssayExperiment_1.32.0
## [19] zlibbioc_1.52.0 cachem_1.1.0
## [21] jsonlite_2.0.0 DelayedArray_0.32.0
## [23] BiocParallel_1.40.0 R6_2.6.1
## [25] bslib_0.9.0 stringi_1.8.7
## [27] RColorBrewer_1.1-3 DNAcopy_1.80.0
## [29] jquerylib_0.1.4 Rcpp_1.1.0
## [31] assertthat_0.2.1 SummarizedExperiment_1.36.0
## [33] knitr_1.51 R.utils_2.13.0
## [35] Matrix_1.7-4 splines_4.4.3
## [37] tidyselect_1.2.1 abind_1.4-8
## [39] yaml_2.3.12 codetools_0.2-20
## [41] curl_7.0.0 lattice_0.22-7
## [43] tibble_3.3.0 plyr_1.8.9
## [45] withr_3.0.2 S7_0.2.1
## [47] evaluate_1.0.5 survival_3.8-3
## [49] pillar_1.11.1 MatrixGenerics_1.18.0
## [51] generics_0.1.4 RCurl_1.98-1.17
## [53] ggplot2_4.0.1 scales_1.4.0
## [55] xtable_1.8-4 glue_1.8.0
## [57] tools_4.4.3 data.table_1.17.8
## [59] GenomicAlignments_1.42.0 XML_3.99-0.20
## [61] grid_4.4.3 crosstalk_1.2.2
## [63] colorspace_2.1-2 GenomeInfoDbData_1.2.13
## [65] RaggedExperiment_1.30.0 restfulr_0.0.16
## [67] cli_3.6.5 S4Arrays_1.6.0
## [69] dplyr_1.1.4 gtable_0.3.6
## [71] R.methodsS3_1.8.2 sass_0.4.10
## [73] digest_0.6.39 SparseArray_1.6.0
## [75] rjson_0.2.23 htmlwidgets_1.6.4
## [77] farver_2.1.2 htmltools_0.5.9
## [79] R.oo_1.27.1 lifecycle_1.0.4
## [81] httr_1.4.7